Conversation
There was an implicit conversion to float64 taking place due to a cast to a Python list (and hence to Python floats). This resulted in a serious (factor ~10) performance degradation, which should now be fixed.
The degradation was caused by temporary allocations performed by pandas whenever a frame's dtype was implicitly changed by updates of the form:
```python
import pandas as pd
import numpy as np
df = pd.DataFrame({"f32": np.ones(2, dtype="float32")})
df.iloc[1:2] = np.float64(1/3)
```
Since pandas 2.1, such operations raise a FutureWarning. All occurrences of that warning in DEMENTpy are resolved in this commit.
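One way to silence the warning while keeping the frame in float32 is to cast the scalar explicitly before the assignment. This is a minimal sketch of the pattern, not necessarily the exact change made in the commit:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"f32": np.ones(2, dtype="float32")})
# Cast the scalar to float32 up front, so the assignment cannot upcast
# the column (and therefore cannot trigger the FutureWarning).
df.iloc[1:2] = np.float32(1 / 3)
print(df["f32"].dtype)  # float32
```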
```diff
 choose_taxa = np.zeros((self.n_taxa, self.gridsize), dtype='int8')
 for i in range(self.n_taxa):
-    choose_taxa[i,:] = np.random.choice([1,0], self.gridsize, replace=True, p=[frequencies[i], 1-frequencies[i]])
+    choose_taxa[i,:] = np.random.binomial(1, frequencies[i], self.gridsize)
```
Contributor (Author):
When I changed the numbers back to "float32", this sampling started failing, saying that the probabilities do not sum to 1.0. I presume they must be converted to "float64" before being summed inside np.random.choice.
I have changed the sampling to binomial, which I believe is equivalent and avoids the summation errors.
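As a quick sanity check of the claimed equivalence (my own sketch, not part of the PR; `p` and `n` are hypothetical stand-ins for `frequencies[i]` and `self.gridsize`): a single binomial trial is exactly a Bernoulli draw, so both samplers have the same distribution.

```python
import numpy as np

p, n = 0.3, 100_000  # hypothetical probability and grid size
rng = np.random.default_rng(0)

bern = rng.binomial(1, p, n)                # Bernoulli via binomial(1, p)
choi = rng.choice([1, 0], n, p=[p, 1 - p])  # same distribution via choice
print(bern.mean(), choi.mean())             # both close to p
```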
```diff
 [Enzyme_Loss,
- Enzyme_Loss.mul(self.Enz_Attrib['N_cost'].tolist()*self.gridsize, axis=0),
- Enzyme_Loss.mul(self.Enz_Attrib['P_cost'].tolist()*self.gridsize, axis=0)],
+ Enzyme_Loss.mul(np.repeat(self.Enz_Attrib['N_cost'].values, self.gridsize), axis=0),
```
Contributor (Author):
The 'float64' was appearing here: the tolist method returns a list of Python floats, which are double precision.
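A minimal illustration of the point above (a standalone sketch, not code from the PR): `tolist` goes through Python floats, which are always 64-bit, whereas `np.repeat` on the underlying `.values` array preserves the original dtype.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.ones(3, dtype="float32"))
via_list = s.tolist() * 2            # Python floats, i.e. 64-bit doubles
via_repeat = np.repeat(s.values, 2)  # stays float32
print(type(via_list[0]))             # <class 'float'>
print(via_repeat.dtype)              # float32
```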
jgwalkup approved these changes on Nov 12, 2025.
Builds on #34, related to #22
The problem was ultimately related to how pandas handles partial assignments in DataFrames and Series. If the data is assigned to from a higher-precision expression, the dtype of the Series may be implicitly changed, which causes a temporary memory allocation.
Please note that the behaviour is a bit peculiar and perhaps unintuitive: pandas will not change the dtype if it is "not necessary", that is, if the new value can be represented exactly in the prior dtype.
The implicit conversion to float64 was causing some DataFrames to change their dtype to "float64", which in turn caused Max_Uptake (in the grid.uptake method) to get repeatedly changed from "float32" to "float64"; this appears to have been the main performance bottleneck.
Tagging @bioatmosphere explicitly since you were interested during the meeting yesterday ;-)
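The "not necessary" behaviour can be illustrated with a short sketch (my own reconstruction, since the original inline example did not survive): a float64 value that is exactly representable in float32 leaves the dtype alone, while an inexact one triggers the implicit dtype change.

```python
import warnings
import numpy as np
import pandas as pd

df = pd.DataFrame({"f32": np.ones(2, dtype="float32")})

# 1.0 is exactly representable in float32, so the dtype is left alone.
df.iloc[1:2] = np.float64(1.0)
print(df["f32"].dtype)  # float32

# 1/3 is NOT exactly representable in float32; depending on the pandas
# version this either upcasts with a FutureWarning (2.x) or is rejected.
try:
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", FutureWarning)
        df.iloc[1:2] = np.float64(1 / 3)
    print(df["f32"].dtype)
except TypeError:
    print("lossy implicit upcast rejected")
```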